Multi-Scale Channel Adaptive Time-Delay Neural Network and Balanced Fine-Tuning for Arabic Dialect Identification
Abstract
The time-delay neural network (TDNN) can consider multiple frames of information simultaneously, making it particularly suitable for dialect identification. However, previous TDNN architectures have focused on only one aspect, either the temporal or the channel information, and lack a unified optimization of both domains. We believe that extracting appropriate contextual information and enhancing channels are both critical for dialect identification. Therefore, in this paper, we propose a novel approach that uses ECAPA-TDNN from the speaker recognition domain as the backbone network and introduce a new multi-scale channel adaptive module (MSCA-Res2Block) to construct a multi-scale channel adaptive time-delay neural network (MSCA-TDNN). The MSCA-Res2Block is capable of extracting multi-scale features, thus further enlarging the receptive field of the convolutional operations. We evaluated our proposed method on the ADI17 Arabic dialect dataset and employed a balanced fine-tuning strategy to address the issue of imbalanced dialect datasets, as well as Z-Score normalization to eliminate score distribution differences among different dialects. After experimental validation, our system achieved an average cost performance (Cavg) of 4.19% and a 94.28% accuracy rate. Compared to ECAPA-TDNN, our model showed a 22% relative improvement in Cavg. Furthermore, our model outperformed the state-of-the-art single-network model reported in the ADI17 competition. In comparison to the best-performing multi-network hybrid system in the competition, our Cavg also exhibited an advantage.
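The Z-Score normalization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistics are used or over which set they are computed, so this example assumes a per-dialect mean and standard deviation taken over a batch of score vectors. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def zscore_normalize(scores):
    """Normalize each dialect's score column to zero mean and unit variance.

    scores: list of per-utterance score lists, one score per dialect.
    Assumption: statistics are computed over the given batch of utterances.
    """
    n_dialects = len(scores[0])
    columns = [[row[d] for row in scores] for d in range(n_dialects)]
    mus = [mean(col) for col in columns]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [
        [(row[d] - mus[d]) / sigmas[d] for d in range(n_dialects)]
        for row in scores
    ]


def predict(score_row):
    """Return the index of the dialect with the highest score."""
    return max(range(len(score_row)), key=score_row.__getitem__)


# Toy scores for three utterances and two hypothetical dialects.
# Dialect 0 receives systematically higher raw scores than dialect 1.
raw = [[0.9, 0.2], [0.8, 0.1], [0.7, 0.3]]
norm = zscore_normalize(raw)
```

In this toy batch, the third utterance is assigned to dialect 0 on raw scores but to dialect 1 after normalization, because its dialect-1 score is high relative to that dialect's usual range. This is the kind of score distribution difference the normalization is meant to remove.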
Similar resources
A Time Delay Neural Network for Online Arabic Handwriting Recognition
Handwriting recognition is an interesting part of the pattern recognition field. In the last decade, several approaches have focused on online handwriting recognition because of the very rapid growth of new technologies in the field of data entry. In this paper, we propose a new system for online Arabic handwriting recognition based on the beta-elliptic model, which allows the trajectory to be segmented into segm...
Decentralized Adaptive Control of Large-Scale Non-affine Nonlinear Time-Delay Systems Using Neural Networks
In this paper, a decentralized adaptive neural controller is proposed for a class of large-scale nonlinear systems with unknown nonlinear, non-affine subsystems and unknown nonlinear time-delay interconnections. The stability of the closed loop system is guaranteed through Lyapunov-Krasovskii stability analysis. Simulation results are provided to show the effectiveness of the proposed approache...
Arabic Dialect Identification
The written form of the Arabic language, Modern Standard Arabic (MSA), differs in a nontrivial manner from the various spoken regional dialects of Arabic – the true “native languages” of Arabic speakers. Those dialects, in turn, differ quite a bit from each other. However, due to MSA’s prevalence in written form, almost all Arabic datasets have predominantly MSA content. In this article, we des...
Verifiably Effective Arabic Dialect Identification
Several recent papers on Arabic dialect identification have hinted that using a word unigram model is sufficient and effective for the task. However, most previous work was done on a standard fairly homogeneous dataset of dialectal user comments. In this paper, we show that training on the standard dataset does not generalize, because a unigram model may be tuned to topics in the comments and d...
Fine Tuning Multi-Channel Compression Hearing Instruments
instruments are devices that provide independent non-linear signal processing in many discrete frequency regions. Multichannel capability allows wearers to understand speech better in certain noisy environments. For example, in pediatric fittings, it helps them to monitor their own voice and assists in their speech production. For the dispensing professional, the capabilities of multi-channel i...
Journal
Journal title: Applied Sciences
Year: 2023
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app13074233